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These issues, quite common in other modular software applications, become more and more dif- 
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1. Introduction 



In the construction of complex modular simulation systems, as the one used in High Energy 
Physics, the question of dividing the system to functional parts is essential. In the modular ap- 
proach, the problem can be divided into parts, and each part can be addressed by different re- 
searchers or teams. However, such approach is only possible if the mathematical structure of the 
problem have certain algebraic properties. 

In practice, the module-based work model is often an idealization: creating scientific software 
can not be separated from the research itself. As a consequence, the architecture of programs needs 
to be modified at various steps of projects development, for example to accommodate more precise 
models. 

Another class of difficulties in the development of a simulation segment arise from the con- 
straints, imposed by other segments of the large project, namely in the definition of data structures 
and interfaces. 

Finally, the demands of the end-users, their expertise in handling simulation blocks (which is 
often limited) and understanding the conceptual models used are of critical importance. 

These types of difficulties are quite universal for any complex, scientific problem. In the fol- 
lowing let us concentrate on a relatively simple model (yet already quite sophisticated) for the sim- 
ulation of QED radiative corrections in decays, and the corresponding simulation package PHOTOS 
[|l|-||]. We believe that presented "development drama" can be of interest not only to the readers 
interested in QED bremsstrahlung, but also in general case. In this respect our presentation can be 
understood as a summary and invitation to reading 

Our contribution is organized as follows. We start with the presentation of the PHOTOS algo- 
rithm in section 2. We highlight the role of the event record in program's construction. We will 
also specify those properties of event record that are necessary for the basic precision level, that is 
essentially as PHOTOS design in years 1991-1994. 

Section 3 is devoted to the description of recent years' developments in the PHOTOS algorithm, 
which lead to the improvement of its performance, but at the same time introduced more strict 
constraint on the event record used as a data source. We will present our efforts in preventing the 
complications for other users of data structures. 

During the years of project's evolution special techniques devoted to detect and resolve var- 
ious types of difficulties related to event record(s) were developed. Section 4 will be devoted to 
their presentation. Internal instrumentation utilities of PHOTOS: debugging subroutine PHLUPA and 
kinematic rounding error correction subroutine PHCORK will be discusses. Then, MC-TESTER 
will be briefly presented. Finally, we will justify the need to duplicate the event information which 
is originally stored in a standard event record HEPEVT. 

Section 5 will be devoted to the discussion of the issues related to event record standardization. 
The question of porting the algorithm to C-i-i- - the process technically completed in 1999 - 
will be covered in that section as well. We will stress the important, yet often ignored, aspect of 
event record construction: the structure must not only be convenient and flexible from the software 
engineer point of view, its contents must also be clear, from the point of view of involved physics 
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models, to the end-users processing the data produced by Monte Carlo event generators. This is 
particularly important if extensions of the standards are to be not only proposed but also used. 
The summary, Section 6, closes the paper. 



2. Basic design of PHOTOS 



Already at an early step of preparation for the x-lepton polarization measurement at LEPl it became 
evident that bremsstrahlung corrections in x decays are necessary for proper modelling of 
theoretical predictions for measurement of the Z boson couplings using x-polarization. For that 
purpose, special routine RADCOR was designed. With certain probability it was replacing the decay 
products of X simulated using TAUOLA by the ones with an extra photon added. This action was 
performed in a strict environment of the explicit list of momentum four-vectors and in the rest- 
frame of X. In fact, the parameter defining the actual decay mode of x was also passed into RADCOR, 
identifying the physical process. 

The X decay products, modified in such way, were passed into the TAUOLA interface to KORALZ 
[10], which was the Monte Carlo program for x pair production at LEPl energies. 

This awkward, yet useful, design was only tailored for a single application, and was missing 
the documentation. The limits of its reliability were not explored at all. Indeed, the fact that it 
could properly simulate the leading-log parts of the first-order corrections was enough. The issues 
of dependence on the choice of the gauge were not risen, nor were the question of the phase-space 
coverage. 

However, this exercise provided an important observation: to simulate the dominant part of 
QED corrections it is sufficient to search through the full event structure, identify the branchings 
corresponding to elementary decays of some particles, extract the information describing all decay 
products, and apply a routine such as RADCOR. In the early 90's and late 80's the HEPEVT event 
record was providing the complete environment that could allow for extension in use of RADCOR 
routine. The first version of PHOTOS ^ was created. Its design relied heavily on the assumption 
that the HEPEVT event record hosts a tree-like structure and that all pointers to decay products 
(daughter pointers) and origins (mother pointers) are defined and consistent. This assumption was 
behind other projects realized by Bob van Eijk at that times, as well, see eg. [11]. At each decay 
spUtting the energy-momentum conservation was supposed to be fulfilled exactly, even if only with 
a single-precision computer arithmetics level. 

The PHOTOS generator gained popularity, and in the documentation of its 1994 version a 
multitude of tests were presented for physical processes for which the theoretical predictions were 
available. The corrections for double bremsstrahlung were added as well. Further development 
was stalled due to limited intersest in improvements, expressed by experimental users. Numerical 
problems blocked development as well. It was only almost 10 years later, when the origins of 
theses problems were identified and corrected since then, significant improvements could be 
introduced into the program. 
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Figure 1: The comparison |Q| of the standard PHOTOS (with multiple photon emission) and the KKMC gen- 
erator (with second-order matrix-element and exponentiation). In the left frame the invariant mass of the 
^+^" pair; SDP= 0.00918. In the right frame the invariant mass of the yypair; SDP=0.00268. The fraction 
of events with two hard photons was 1 .2659 ± 0.001 1 % for KORALZ and 1 .2952 ± 0.001 1 % for PHOTOS. 



3. Toward high precision in PHOTOS 



Already in it was found, that if the algorithm for the single-photon generation is properly iter- 
ated, the leading corrections of the double-photon emissions can be incorporated as well. In 
this iterative solution was extended to multiple-photon emission and the tests for leptonic Z decays 
have shown that this solution reproduce the results for the final-state bremsstrahlung of the KKMC 
Monte Carlo program [12] with amazing precision. Few years earlier, other tests and extensions 
important for Higgs or W decays were introduced also [13]. The good performance of the program 
was related to other improvements, introduced at that time and valid for all decays. For example 
the implementation became exact in the soft-photon limit. These improvements, however, are of 
limited importance from the point of view of software organization of PHOTOS interface with other 
packages. 



Let us now concentrate on another class of corrections, discussed in refs. [|4[ |14p and corre- 
sponding to the implementation of the exact, first-order matrix-element kernel in bremsstrahlung 
correction generation for Z l^l^ and B — > K{%)K{%) decays. In the latter case, all possible 
combinations of charges and the replacement of % with K were used. 

Let us start with the presentation of the numerical results of the test for Z f^^f^^ decays. 

In the first figure we show the comparison of the two plots giving largest discrepancies be- 
tween PHOTOS (run with standard options) and KKMC (run with second order matrix element and 
exponentiation). The plots present the invariant mass of the fi^in^ pair and the invariant mass of 
the two hardest photons for the events with at least two photons of energies above 1 GeV in the 
rest frame of the Z. The rates for the event samples predicted by the two programs are given in the 
figure's caption. As one can see the agreement between KKMC and PHOTOS is better than 0.1 % (if 
calculated with respect to the total Z decay rate), yet the differences are still visible from the results 
of simulations with 10^ events. If, as in figure 2, the complete NLO kernel is activated in PHOTOS, 
the differences get reduced by about a factor of 50! This is indeed interesting results of ref. [H]. 
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Figure 2: The comparisons [Q] of the improved PHOTOS (with multiple photon emission) and the KKMC 
generator ( with second order matrix element and exponentiation). In the left frame the invariant mass of the 
^1+^1- pair; SDP= 0.00142. In the right frame the invariant mass of the yy; SDP=0.00293. The fraction of 
events with two hard photons was 1.2659 ± 0.0011% for KORALZ and 1.2868 ± 0.001 1% for PHOTOS. 



At this point it is important to realize what is the price to pay for such improvement. Certainly, 
it is not the computer time - it remains small and the samples of order of 10^ could easily be 
simulated overnight (PHOTOS is in fact significantly faster than KKMC). To answer this question one 
has to recall the formula for the final Monte Carlo weight in PHOTOS. 

Let us write (separated from the phase-space Jacobians) the explicit form of the real-photon 
matrix element, as used in the standard version of PHOTOS (as published in [|l|, ^]) for the e^e^ — > 
Z°/y* — > /^+/^^(y) process: 
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For its calculation (with respect to Bom cross-section) it is enough to know the four momenta of 
the Z and its decay products. In the presented formulae we follow the notations from refs. p^ ]. 



This expression is to be compared with the exact one, taken from ref. [ ]150 
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The resulting weight is rather simple, and reads: 
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For its calculation the numerical value of the electroweak couplings of Z to fermions, as well as 
information on the state from which the Z was produced is nonetheless necessary. This seemingly 
trivial requirement puts new requirements on the event record: the details of the process of the Z 
productions need to be coded in the event record, then correctly deciphered by PHOTOS to calculate 
the process-dependent weight. From our experience this requirement of PHOTOS may be difficult 
to accept by other users of event records. The authors of event generators often choose their own 
conventions in encoding the details of hard process such as qq ngZ/'f;Z/'f — into the 
event record. 

The NLO solution for PHOTOS would therefore be feasible with some universal, standard event 
record, nonetheless difficult due to practical issues of interfacing. However, as can be seen from 
the figures, the NLO precision in PHOTOS for today and tomorrow experiments is most likely not 
required. For the time being the problem remain rather academic. 



In ref 1 14], we presented similar modifications in the PHOTOS kernel for the decay of B mesons 
into a pair of scalars. As one can see from the comparison of plots in figures 3, 4 and 5 the imple- 
mentation of the exact (scalar-QED only) kernel brings a minuscule improvement in the agreement 



between PHOTOS and the reference exact simulation of SANG 016|]. In this case both: SANC and 
PHOTOS are used to simulate single photon emission (There exists no reference simulation with 
which the multi-photon version of PHOTOS could be compared.). 

For the NLO kernel in PHOTOS the results are indistinguishable from those of SANC, even at 
statistical level of 10^ events. In this case, the technical price seems to be zero, as there is no need 
for extra information to be pumped from the event record to the calculation of the PHOTOS weight. 
Actually, the exact kernel is even simpler than the one used so far. 

However, this gain may be elusive: the dependencies on the production process may appear if 
form-factors (originating from some unspecified here models) are fitted to the data. 

From the practical side, one can interpret this excellent agreement as a strong test of numerical 
performance of the program. The necessary studies of the exact parametrization of the phase space 
used in PHOTOS, which will also be important for future version of PHOTOS, are described in detail 
in the journal version of ref. [0]. 



4. PHOTOS debugging tools 

During the years of PHOTOS development, various software-related problems needed to be faced by 
its authors. The majority of the workload (and actual lines of code) needed to be devoted to the 
treatment of the data stored in a "standard" HEPEVT event record. 
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Figure 3: Results [ p4[ | from PHOTOS, standard version, and SANC for Z?" K^K^ (y) decay are superimposed 
on the consecutive plots. Standard distributions, as defined in the text and logarithmic scales are used. The 
distributions from the two programs overlap almost completely. Samples of 10^ events were used. The 
ultraviolet scale, ^y^, was chosen to leave total decay width unchanged by QED. 



In an ideal situation, if all rules of the HEPEVT standard definition were respected, PHOTOS 
would easily identify the branching points (particle decays with charged products) in the decay 
tree, extract the required data, then eventually append the generated photons (if any) as additional 
decay products. It would also traverse the decay tree to identify all possible places where QED 
corrections might need to be generated. Initially, the algorithms employed in PHOTOS assumed that 
the data structure is consistent, with all pointers (to "mothers" and "daughters") set up correctly. 
An acyclic tree of 1 — > « (or exceptionally 2 — > «) processes could have easily been navigated using 
standard algorithms. 

However, the rigidness of the HEPEVT standard, and the lack of possibility of extending it in a 
consistent way forced the authors of event generators to overload the meaning of the elements of the 
HEPEVT data structure. Certainly, one could consider having bi-directional relations (i.e. mothers 
pointing to daughters, and daughters pointing to mothers" as redundancy, and the place where 
additional information (such as spin or colour flow) may be stored instead. The meaning of the 
pointers become generator-specific and the navigation in such data structure could not be performed 
by a generic algorithm. Pandora's box of event-record problems has been opened, hurting mainly 
the coordinators of the large experimental simulation chains. 
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Figure 4: Results 0] from PHOTOS, standard version, and SANC for ratios of the B° ^ 71 ^"*"(y) distri- 
butions are presented. Differences between PHOTOS and SANC are small, but are clearly visible now 



The pointers in the HEPEVT structure were not the only element that became non-standard. 
Due to evolving needs of physics models (such as bigger number of particles being simulated, or 
precision), HEPEVT data structure has been modified to store single- or double-precision data, with 
various array sizes. Dubious matching of the HEPEVT layout between simulation blocks became yet 
more complicated. 

To alleviate the problem with varying precision and layout of HEPEVT, PHOTOS has been 
equipped with a set of debugging and data-interpretation facilities. Firstly, it was modified to 
work on a local copy of the event record (the layout of which followed the "well-behaved" HEPEVT 
standard), and have a set of functions that would transfer the data between whatever external variant 
of HEPEVT data structure, and the internal storage. Secondly, a set of sanity-checking and pointer- 
reconstructing procedures were applied during the transfer of the data between the external event 
record and the internal one. Finally, a debugging function PHLUPA was provided. It prints out the 
data as interpreted/modified by PHOTOS routines, at different steps of event construction. 

Because of the specifics of the PHOTOS algorithm, namely massive search and modification of 
the complete event tree, PHOTOS itself has became a debugging tool for large simulation chains in 
the experimental collaborations. Strengthened with its debugging tools, it helped to identify many 
problems related to event grammatic. 
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Figure 5: Results [|I4|] from PHOTOS with the exact matrix element, and SANC for ratios of the — > 7t K+ (y) 
distributions. Differences between PHOTOS and SANC are below statistical error for samples of 10^ events. 



The numerical stability of PHOTOS was, for many years, a problem faced by its users (and 
authors). As the energy ranges, to which it was applied, were raising towards the ones of the 
LHC, the problems became more and more severe. Again, the event record data, filled by other 
generators, was often the main culprit of these instabilities. PHOTOS requires that the Energy- 
Momentum conservation in a decay process is absolutely respected, and with a numerical high 
precision, otherwise the boost operations performed between the laboratory rest-frame and the rest- 
frame of highly energetic particles cannot be calculated. To deal with the problem of insufficient 
precision of the energy-momentum conservation in the event data, the PHCORK routine was added 
to PHOTOS. It verified (and corrected) the kinematics of particles being processed by PHOTOS, so 
that the four-momenta of children sum up to the four-momentum of the mother particle, and the 
— = invariant was preserved for all of particles. From physical point of view, this latter 
relation contain ambiguity: for wide resonances (particles the mass of which has wide spectrum) 
one might speak about "off-mass-shell" particles, for which the mass doesn't need to match this 
equation... A few modes of operation of PHCORK were prepared to deal with various scenarios, and 
interpret the data correctly. 

To be able to explore the potential of the PHOTOS algorithm, and to make the debugging of the 
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event record data easier, a tool: MC-TESTER originally developed for tests of TAUOLA was adopted. It 
performs comparison tests of distributions of invariant masses produced by two, or more, (versions 
of) event generators. In semi-automatic way (thus eliminating the risk of accidental programmatic 
error) it extract the data from event records filled by an event generators: it identifies the decay 
modes of a given particle, and for every mode it builds the histograms of invariant masses of 
all combinations of decay products. After the data from two runs of MC-TESTER - instrumented 
generator is completed, the data collected by MC-TESTER in the runs is compared, and presented 
in a visual form of plots. For each plot, corresponding distributions from the two runs are plotted, 
and the ratio of the distribution is overimposed, giving an overview of discrepancies between the 
corresponding distribution. A "Shape Difference Parameter" (SDP) is also calculated, to quantify 
these differences. All the plots, with the values of the SDP and branching ratios for all identified 
decay channels are presented in an easy to navigate, printable "booklet". 

The results of the MC-TESTER - based comparisons of PHOTOS with high-precision Monte Carlo 
generators (for processes such as leptonic Z decays) were actually the motivation to improve the 
precision of PHOTOS. 

5. Challenges of event record 

On reading the paper, one could get an impression that the communication between the mod- 
ules of the simulation tree is a challenging, yet standard, goal, which could be realized in any 
modern, or even not so modern software environment. Our discussion in the previous chapters 
pointed to possible constraints and requirements in the organization of such data structure imposed 
by relatively modest application PHOTOS, if its precision requirements would need to be increased 
beyond certain level. 

One can ask the question why the C++ implementation of PHOTOS did not meet so far as much 
attention as its seemingly obsolete FORTAN version. On a first sight the answer is simple: there 
was until recently no commonly agreed standard for the event record data structure in C-i-i- accepted 
by the dominant part of the community. Recently, it seems that the HepMC [ [T7| ] structure is gaining 
popularity in the LHC applications. 

From the past experience of HEPEVT standard, one could postulate that a viable event record 
could be seen as a system of parallel trees, as shown in Fig.6, the nodes of each being bound 
with correspondence relations spanning across the layers. Each individual tree should be easy to 
investigate or modify by program such as PHOTOS, or the detector simulation software. 

We believe that the universal event record of the future should be kept independent of the 
theorist applications, or - better - from any direct applications. That has been the case for the 
JETSET/PYTHIA family of programs, which had their internal event structure, and a translation 
routine allowing to transfer the data from/to the universal HEPEVT. The authors of PHOTOS have 
also realized the advantages of such approach. 

In the last part of this Section let us show our approach to the event record problem as used in 
our MC-TESTER tool. 

For MC-TESTER to be effective in comparing the results generated by various Monte Carlo 
simulators, it was essential to provide access to various standards (or: flavors) of event record data 
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Figure 6: Event Record as inter-linked layers of trees. 
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Figure 7: The concept of the HEPEventLib interface. 



structures. Similarly to PHOTOS, MC-TESTER performs exhaustive search and data extraction from 
event record, it however doesn't need to modify the contents of the event record. Typical data 
that needed to be extracted was the mother-daughter relationships (including finding out the non- 
decaying final-state particles in the cascade decay), and determining the properties (four-momenta, 
and type) for involved particles. To separate MC-TESTER from problems related to event record 
processing, the HEPEventLib abstraction layer was created (see Fig. 7). 

At the technical layer, HEPEventLib does nothing more than interpretation of the data stored in 
various event record standards, and providing these data in a consistent form to the main program, 
hence hiding all dependencies and data-translation operations. The data is provided by abstract 
object representing a particle, a list of particles and an event. As no modification is performed 
on the structure of the event record, the properties visible in a "particle" object may be mapped 
directly to the corresponding data in the underlying event record. Thanks to that feature, particle's 
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properties (such as four-momenta, but not the mother-daughter attributions) might even be modified 
from within HEPEventLib's abstract view, and the changes would be propagated to the actual event 
record in a consistent way. 

At the bottom of the HEPEventLib there are implementations of the HEPEventLib abstraction 
to concrete event record types, such as HEPEVT, LUJETS, PYJETS, and HERWIG-specific version of 
HEPEVT. New "backends" for any (future) event record may be implemented as needed - MC-TESTER 
will automatically profit from the new standard with no need of adding a single fine of code in it. 

The use of HEPEventLib is currently hmited to the MC-TESTER project. However, it may also 
be used by any other project, without introducing any dependencies on MC-TESTER. We beheve 
that similar "screening" (or "interpreter") approach, for separating-out the dependencies on the 
external event record from the main code of simulation may provide substantial simplification of 
the code. It also interplays in a natural way with the approach of having an "external event record" 
for data-exchange with other simulation modules and "private event record" to keep the state of the 
simulation while a new event is being generated. 



6. Summary 

In the present talk we have reviewed the basic properties of theoretical (QED) and software envi- 
ronment which is at foundation of design and performance of PHOTOS Monte Carlo program for 
simulation of QED bremsstrahlung corrections in decays of resonances and particles encoded in 
different type of event records used in simulations of High Energy physics. This presentation may 
be of interest not only for the program users (present and future ones) involved in experimental 
data analysis and organizing software for such analysis, but also for people interested in similar 
software organization problems appearing in other applications and their matching with software 
environment. 
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